Nicholas Carlini, a research scientist at Anthropic, demonstrated that Claude Code can identify remotely exploitable security vulnerabilities within the Linux kernel. Most significantly, the AI discovered a heap buffer overflow in the NFS driver that had remained undetected for 23 years. By using a simple script to direct the model's attention to specific source files, Carlini was able to uncover complex bugs that require a deep understanding of intricate protocols. While the discovery highlights the growing power of large language models in cybersecurity, it also presents a new bottleneck: the massive volume of potential vulnerabilities found by AI requires significant manual effort from human researchers to validate and report.
This article details a hands-on experience with Nvidia's NemoClaw, a security-focused stack designed to enhance the safety of the OpenClaw AI platform. While NemoClaw introduces improvements like a sandbox model and aggressive policy filtering, the author finds it still falls short of being a reliable solution.
Bugs, limitations, and the inherent risks associated with OpenClaw's architecture—particularly its connection to external services—persist. The core issue remains that NemoClaw can secure the agent but cannot protect against malicious instructions embedded in external data sources like emails or messages.
The author concludes that while NemoClaw is a step forward, it doesn't fully address the fundamental security concerns surrounding OpenClaw.
NIST is launching a new project around standards for artificial intelligence agents, seeking feedback on the secure use of the rapidly evolving technology. The initiative focuses on security concerns arising from the autonomous nature of AI agents and aims to foster interoperability and public trust. It includes a request for information on AI agent security and a draft concept paper on software and AI agent identity and authorization.